Search CORE

108 research outputs found

The effect of minor allele frequency on the likelihood of obtaining false positives

Author: AC Lam
IP Gorlov
JC Florez
Jessica G Woo
KG Ardlie
LA Cupples
Lisa J Martin
Meredith E Tabangin
V Moskvina
Publication venue: BioMed Central
Publication date: 01/01/2009

Determining the most promising single-nucleotide polymorphisms (SNPs) presents a challenge in genome-wide association studies, in which hundreds of thousands of association tests are conducted. The power to detect genetic effects depends on minor allele frequency (MAF), and the SNP arrays used in genome-wide association studies include SNPs with a wide distribution of MAFs. Therefore, it is critical to understand MAF's effect on the false positive rate.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

GWAS Meets Microarray: Are the Results of Genome-Wide Association Studies and Gene-Expression Profiling Consistent? Prostate Cancer as an Example

Author: AR Ramjaun
Christopher Amos
Christopher J. Logothetis
D Duggan
DW Huang
E Delva
Eshel Ben-Jacob
G Thomas
Gary E. Gallick
H Goel
IP Gorlov
Ivan P. Gorlov
J Gudmundsson
L Lacroix
M Fornaro
M Piao
M Takkunen
MC Brown
MD Hansen
MD Mason
MI McCarthy
Olga Y. Gorlova
PA Konstantinopoulos
R Chen
R Rosenthal
RK Nam
S Etienne-Manneville
SA Ochsner
SJ Moschos
SR Browning
T Bao
VM Bazas
XS Ke
Publication venue: Public Library of Science
Publication date: 01/08/2009

Genome-wide association studies (GWASs) and global profiling of gene expression (microarrays) are two major technological breakthroughs that allow hypothesis-free identification of candidate genes associated with tumorigenesis. It is not obvious whether the candidate genes identified by GWAS (GWAS genes) are consistent with those identified by profiling gene expression (microarray genes). We used the Cancer Genetic Markers Susceptibility database to retrieve single nucleotide polymorphisms from candidate genes for prostate cancer. In addition, we conducted a large meta-analysis of gene expression data in normal prostate and prostate tumor tissue. We identified 13,905 genes that were interrogated by both GWASs and microarrays. On the basis of P values from GWASs, we selected the 1,649 most significantly associated genes for functional annotation by the Database for Annotation, Visualization and Integrated Discovery. We also conducted functional annotation analysis using the same number of top genes identified in the meta-analysis of the gene expression data. We found that genes involved in cell adhesion were overrepresented among both the GWAS and microarray genes. We conclude that combining GWAS and microarray data would be a more effective approach than analyzing individual datasets and can help to refine the identification of candidate genes and functions associated with tumor development.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Evaluation of association tests for rare variants using simulated data sets in the Genetic Analysis Workshop 17 data

Author: AL Price
B Li
BE Madsen
C Dering
Chuanyu Sun
D Nettleton
Degui Zhi
G Kang
Guimin Gao
IP Gorlov
Jiexun Wang
JN Hirschhorn
LA Almasy
LD Brown
MD Ernst
Nianjun Liu
Wellcome Trust Case Control Consortium
Wen Wan
Wenan Chen
Xi Gao
Xiangning Chen
Publication venue: BioMed Central
Publication date: 01/01/2011

We evaluate four association tests for rare variants—the combined multivariate and collapsing (CMC) method, two weighted-sum methods, and a variable threshold method—by applying them to the simulated data sets of unrelated individuals in the Genetic Analysis Workshop 17 (GAW17) data. The family-wise error rate (FWER) and average power are used as criteria for evaluation. Our results show that when all nonsynonymous SNPs (rare variants and common variants) in a gene are jointly analyzed, the CMC method fails to control the FWER; when only rare variants (single-nucleotide polymorphisms with minor allele frequency less than 0.05) are analyzed, all four methods can control FWER well. All four methods have comparable power, which is low for the analysis of the GAW17 data sets. Three of the methods (not including the CMC method) involve estimation of p-values using permutation procedures that either can be computationally intensive or generate inflated FWERs. We adapt a fast permutation procedure into these three methods. The results show that using the fast permutation procedure can produce FWERs and average powers close to the values obtained from the standard permutation procedure on the GAW17 data sets. The standard permutation procedure is computationally intensive
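The collapsing idea and the standard permutation procedure mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the CMC method itself, nor the fast permutation procedure the authors adapt: it only collapses rare variants (MAF below 0.05, the cutoff quoted above) into a per-individual carrier indicator and estimates a p-value by shuffling phenotype labels. The function names, the difference-in-carrier-proportion statistic, and the data layout are assumptions made for illustration.

```python
import random


def collapse_rare(genotypes, mafs, maf_cutoff=0.05):
    """Return 1 per individual carrying any rare minor allele, else 0.

    genotypes: one list per individual of minor-allele counts (0/1/2).
    mafs: minor allele frequency of each variant (same column order).
    """
    rare_idx = [j for j, f in enumerate(mafs) if f < maf_cutoff]
    return [1 if any(row[j] > 0 for j in rare_idx) else 0 for row in genotypes]


def carrier_diff(carriers, phenotypes):
    """Carrier proportion in cases (1) minus proportion in controls (0)."""
    cases = [c for c, y in zip(carriers, phenotypes) if y == 1]
    controls = [c for c, y in zip(carriers, phenotypes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def permutation_p(carriers, phenotypes, n_perm=1000, seed=0):
    """Two-sided permutation p-value obtained by shuffling phenotype labels."""
    rng = random.Random(seed)
    observed = abs(carrier_diff(carriers, phenotypes))
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(carrier_diff(carriers, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

The cost of the loop grows linearly with `n_perm`, which is the computational burden the abstract attributes to the standard permutation procedure.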

Crossref

Springer - Publisher Connector

PubMed Central

VCU Scholars Compass

Novel genetic variants in miR-191 gene and familial ovarian cancer

Author: D Ford
DF Easton
GA Calin
HT Lynch
Hua Zhao
IP Gorlov
J Shen
Jie Shen
K Jazdzewski
Kunle Odunsi
L Zhang
N Iovino
R Duan
R Yancik
Richard DiCioccio
S Volinia
SA Narod
Shashikant B Lele
T Pejovic
T Xu
Y Zheng
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

Analysis of human mini-exome sequencing data from Genetic Analysis Workshop 17 using a Bayesian hierarchical mixture model

Author: A Luedtke
AE Raftery
C Dering
Corinne D Engelman
EI George
Gota Morota
IP Gorlov
Julio S Bueno Filho
Kristin J Meyers
Lina M Vera-Cala
Matthew J Maenner
N Yi
Quoc Tran
R Development Core Team
SB Ng
T Meuwissen
TA Manolio
TH Meuwissen
THE Meuwissen
Publication venue: BioMed Central
Publication date: 01/01/2011

Next-generation sequencing technologies are rapidly changing the field of genetic epidemiology and enabling exploration of the full allele frequency spectrum underlying complex diseases. Although sequencing technologies have shifted our focus toward rare genetic variants, statistical methods traditionally used in genetic association studies are inadequate for estimating effects of low minor allele frequency variants. For our study, we use the Genetic Analysis Workshop 17 data from 697 unrelated individuals (genotypes for 24,487 autosomal variants from 3,205 genes). We apply a Bayesian hierarchical mixture model to identify genes associated with a simulated binary phenotype using a transformed genotype design matrix weighted by allele frequencies. A Metropolis-Hastings algorithm is used to jointly sample each indicator variable and additive genetic effect pair from its conditional posterior distribution, and the remaining parameters are sampled by Gibbs sampling. This method identified 58 genes with a posterior probability greater than 0.8 for being associated with the phenotype. One of these 58 genes, PIK3C2B, was correctly identified as being associated with affected status based on the simulation process. This project demonstrates the utility of Bayesian hierarchical mixture models using a transformed genotype matrix to detect genes containing rare and common variants associated with a binary phenotype.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Disease risk prediction with rare and common variants

Author: A Dasgupta
ACJW Janssens
AL Gloyn
AL Price
Andrew T DeWan
B Li
C Cortes
C Dering
Chengqing Wu
DH Ballard
IJ Kullo
IP Gorlov
J Asimit
Josephine Hoh
Kyle M Walsh
L Almasy
NR Wray
P Kraft
Science/AAAS
T Hastie
V Bansal
Z Wei
Zuoheng Wang
Publication venue: BioMed Central
Publication date: 01/01/2011

A number of studies have been conducted to investigate the predictive value of common genetic variants for complex diseases. To date, these studies have generally shown that common variants have no appreciable added predictive value over classical risk factors. New sequencing technology has enhanced the ability to identify rare variants that may have larger functional effects than common variants. One would expect rare variants to improve the discrimination power for disease risk by permitting more detailed quantification of genetic risk. Using the Genetic Analysis Workshop 17 simulated data sets for unrelated individuals, we evaluate the predictive value of rare variants by comparing prediction models built using the support vector machine algorithm with or without rare variants. Empirical results suggest that rare variants have appreciable effects on disease risk prediction

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

PoolHap: Inferring Haplotype Frequencies from Pooled Samples by Next Generation Sequencing

Author: A Fagotti
Chris Tyler-Smith
Daniel C. Jeffares
E Nanak
H Jiang
H Li
HP Liu
IP Gorlov
J Supabandhu
K Khrapko
Kai Ye
LE Fuhrman
M Al-Hajj
M Stephens
Magnus Nordborg
P Medvedev
P Navas
Qingrun Zhang
Quan Long
Thomas Mailund
TL Turner
Viktoria Nizhynska
Zemin Ning
Publication venue: Public Library of Science
Publication date: 01/01/2011

With the advance of next-generation sequencing (NGS) technologies, increasingly ambitious applications are becoming feasible. A particularly powerful one is the sequencing of polymorphic, pooled samples. The pool can be naturally occurring, as in the case of multiple pathogen strains in a blood sample, multiple types of cells in a cancerous tissue sample, or multiple isoforms of mRNA in a cell. In these cases, it is difficult or impossible to partition the subtypes experimentally before sequencing, and the subtype frequencies must hence be inferred. In addition, investigators may occasionally want to artificially pool samples from a large number of individuals for reasons of cost-efficiency, e.g., when carrying out genetic mapping using bulked segregant analysis. Here we describe PoolHap, a computational tool for inferring haplotype frequencies from pooled samples when haplotypes are known. The key insight into why PoolHap works is that the large number of SNPs that come with genome-wide coverage can compensate for the uneven coverage across the genome. The performance of PoolHap is illustrated and discussed using simulated and real data. We show that PoolHap is able to accurately estimate the proportions of haplotypes with less than 2% error for 34-strain mixtures with 2X total coverage Arabidopsis thaliana whole genome polymorphism data. This method should facilitate greater biological insight into heterogeneous samples that are difficult or impossible to isolate experimentally. Software and user's manual are freely available at http://arabidopsis.gmi.oeaw.ac.at/quan/poolhap/
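The abstract does not describe PoolHap's estimation procedure, so the following is only a generic illustration of the underlying problem: estimating the mixing proportions of known haplotypes from pooled reads. It uses a textbook expectation-maximization update for mixture weights, which is an assumption and not PoolHap's algorithm. The function name, the `(snp_index, allele)` read encoding, and the fixed sequencing error rate `err` are all hypothetical.

```python
def em_haplotype_freqs(haplotypes, reads, n_iter=200, err=0.01):
    """Estimate haplotype proportions in a pool with a simple EM.

    haplotypes: K known haplotypes, each a list of 0/1 alleles over S SNPs.
    reads: observations as (snp_index, allele) pairs, one allele per read.
    err: assumed probability that a read shows the wrong allele.
    """
    k = len(haplotypes)
    if k == 0 or not reads:
        raise ValueError("need at least one haplotype and one read")
    freqs = [1.0 / k] * k  # start from a uniform mixture
    for _ in range(n_iter):
        totals = [0.0] * k
        for snp, allele in reads:
            # E-step: posterior probability that each haplotype produced the read.
            like = [
                f * ((1.0 - err) if hap[snp] == allele else err)
                for f, hap in zip(freqs, haplotypes)
            ]
            norm = sum(like)
            for i in range(k):
                totals[i] += like[i] / norm
        # M-step: new proportions are the average responsibilities.
        freqs = [t / len(reads) for t in totals]
    return freqs
```

Because each read covers a single SNP, haplotypes are only distinguishable where their alleles differ; having many SNPs across the genome is what makes the proportions identifiable, which echoes the insight stated in the abstract.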

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

Prioritizing genes associated with prostate cancer development

Author: A Nikitin
B Gur-Dedeoglu
Christopher J Logothetis
Curtis A Pettaway
DJ Brennan
G Dennis Jr
Hongya Zhao
IP Gorlov
Ivan P Gorlov
J Byun
JF Reid
Jin Young Byun
Kanishka Sircar
L Shen
L Wang
M Dolled-Filhart
M Habeck
M Stanbrough
MP Jansen
ND Price
Nora M Navone
O Gevaert
Olga Y Gorlova
P Yue
Patricia Troncoso
R Edgar
R Lin
R Rosenthal
R Zigeuner
SA Ochsner
Sankar N Maity
T Barrett
T Nakagawa
UR Chandran
YH Yang
Publication venue: BioMed Central
Publication date: 01/11/2010

Background: The genetic control of prostate cancer development is poorly understood. Large numbers of gene-expression datasets on different aspects of prostate tumorigenesis are available. We used these data to identify and prioritize candidate genes associated with the development of prostate cancer and bone metastases. Our working hypothesis was that combining meta-analyses on different but overlapping steps of prostate tumorigenesis will improve identification of genes associated with prostate cancer development. Methods: A Z score-based meta-analysis of gene-expression data was used to identify candidate genes associated with prostate cancer development. To put together different datasets, we conducted a meta-analysis on 3 levels that follow the natural history of prostate cancer development. For experimental verification of candidates, we used in silico validation as well as in-house gene-expression data. Results: Genes with experimental evidence of an association with prostate cancer development were overrepresented among our top candidates. The meta-analysis also identified a considerable number of novel candidate genes with no published evidence of a role in prostate cancer development. Functional annotation identified cytoskeleton, cell adhesion, extracellular matrix, and cell motility as the top functions associated with prostate cancer development. We identified 10 genes (CDC2, CCNA2, IGF1, EGR1, SRF, CTGF, CCL2, CAV1, SMAD4, and AURKA) that form hubs of the interaction network and therefore are likely to be primary drivers of prostate cancer development. Conclusions: By using this large 3-level meta-analysis of the gene-expression data to identify candidate genes associated with prostate cancer development, we have generated a list of candidate genes that may be a useful resource for researchers studying the molecular mechanisms underlying prostate cancer development.
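The abstract names a Z score-based meta-analysis without giving the formula. One common way to combine per-study Z scores is Stouffer's method, sketched below in Python; whether the authors used this exact weighting is an assumption, and the function names are illustrative only.

```python
import math


def stouffer_z(z_scores, weights=None):
    """Combine per-study Z scores for one gene with Stouffer's method.

    With no weights, every study counts equally; weights are often chosen
    proportional to the square root of each study's sample size.
    """
    if not z_scores:
        raise ValueError("need at least one Z score")
    if weights is None:
        weights = [1.0] * len(z_scores)
    if len(weights) != len(z_scores):
        raise ValueError("weights and z_scores must have the same length")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value of a standard normal Z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

For example, four studies each reporting Z = 1 for the same gene combine to Z = 2, which shows how consistent but individually weak signals can reinforce each other across datasets.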

Crossref

Directory of Open Access Journals

PubMed Central

Effect of BRCA2 sequence variants predicted to disrupt exonic splice enhancers on BRCA2 transcripts

Author: A Zatkova
AA Tesoriero
AB Spurdle
Amanda B Spurdle
Brooke L Brewster
C Bonnet
CA Pettigrew
Christopher A Pettigrew
DJ Farrugia
F Bonatti
HX Liu
IP Gorlov
JD Fackenthal
JT den Dunnen
L Cartegni
Logan C Walker
MA Brown
Melissa A Brown
O Anczukow
P Lastella
Phillip J Whiley
S Millevoi
TC Burn
V Caux-Moncoutier
WG Fairbrother
Y Yang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Genetic screening of breast cancer patients and their families has identified a number of variants of unknown clinical significance in the breast cancer susceptibility genes BRCA1 and BRCA2. Evaluation of such unclassified variants may be assisted by web-based bioinformatic prediction tools, although accurate prediction of aberrant splicing by unclassified variants affecting exonic splice enhancers (ESEs) remains a challenge.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

Discovery of Rare Variants via Sequencing: Implications for the Design of Complex Trait Association Studies

Author: AL Price
B Kerem
B Li
Bingshan Li
D Azzopardi
D Keen-Kim
David B. Allison
GA McVean
IP Gorlov
JC Cohen
JK Pritchard
JM Van Liere
LR Brunham
LR Cardon
MC King
MI McCarthy
N Ahituv
N Siva
RF Service
RR Hudson
S Romeo
Suzanne M. Leal
TA Manolio
TL Slatter
W Bodmer
W Ji
Publication venue: Public Library of Science
Publication date: 01/05/2009

There is strong evidence that rare variants are involved in complex disease etiology. The first step in implicating rare variants in disease etiology is their identification through sequencing in both randomly ascertained samples (e.g., the 1,000 Genomes Project) and samples ascertained according to disease status. We investigated to what extent rare variants will be observed across the genome and in candidate genes in randomly ascertained samples, the magnitude of variant enrichment in diseased individuals, and biases that can occur due to how variants are discovered. Although sequencing cases can enrich for causal variants, when a gene or genes are not involved in disease etiology, limiting variant discovery to cases can lead to association studies with dramatically inflated false positive rates.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central